Towards Privacy-Preserving Speech Data Publishing
نویسندگان
چکیده
Privacy-preserving data publishing has been a heated research topic in the last decade. Numerous ingenious attacks on users’ privacy and defensive measures have been proposed for the sharing of various data, varying from relational data, social network data, spatiotemporal data, to images and videos. Speech data publishing, however, is still untouched in the literature. To fill this gap, we study the privacy risk in speech data publishing and explore the possibilities of performing data sanitization to achieve privacy protection while preserving data utility simultaneously. We formulate this optimization problem in a general fashion and present thorough quantifications of privacy and utility. We analyze the sophisticated impacts of possible sanitization methods on privacy and utility, and also design a novel method – key term perturbation for speech content sanitization. A heuristic algorithm is proposed to personalize the sanitization for speakers to restrict their privacy leak (p-leak limit) while minimizing the utility loss. The simulations of linkage attacks and sanitization on real datasets validate the necessity and feasibility of this work.
منابع مشابه
ارایه یک روش جدید انتشار دادهها با حفظ محرمانگی با هدف بهبود دقّت طبقهبندی روی دادههای گمنام
Data collection and storage has been facilitated by the growth in electronic services, and has led to recording vast amounts of personal information in public and private organizations databases. These records often include sensitive personal information (such as income and diseases) and must be covered from others access. But in some cases, mining the data and extraction of knowledge from thes...
متن کاملPrivacy Preserving Techniques for Speech Processing
Speech is perhaps the most private form of personal communication but current speech processing techniques are not designed to preserve the privacy of the speaker and require complete access to the speech recording. We propose to develop techniques for speech processing which do preserve privacy. While our proposed methods can be applied to a variety of speech processing problems and also gener...
متن کاملPrivacy - Preserving Data Publishing
The success of data mining relies on the availability of high quality data. To ensure quality data mining, effective information sharing between organizations becomes a vital requirement in today's society. Since data mining often involves person-specific and sensitive information like medical records, the public has expressed a deep concern about their privacy. Privacy-preserving data publishi...
متن کاملAn Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling
In recent years, privacy concerns about social network graph data publishing has increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...
متن کاملA Survey of Privacy Preserving Data Publishing using Generalization and Suppression
Nowadays, information sharing as an indispensable part appears in our vision, bringing about a mass of discussions about methods and techniques of privacy preserving data publishing which are regarded as strong guarantee to avoid information disclosure and protect individuals’ privacy. Recent work focuses on proposing different anonymity algorithms for varying data publishing scenarios to satis...
متن کامل